🤝 Merge algorithms
In this section, we will focus on four methods currently implemented in mergekit. Note that there are other methods not covered here, such as linear interpolation and Task Arithmetic. If you're interested in papers on model merging, I recommend this excellent collection on Hugging Face.
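To build some intuition for one of these methods, here is a minimal sketch of SLERP (spherical linear interpolation), which blends two weight tensors along the arc between them rather than along a straight line. This is an illustrative implementation, not mergekit's actual code; the function name `slerp`, the NumPy flattening, and the colinearity threshold are all choices made for this example.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two weight tensors at fraction t."""
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Normalized copies are used only to measure the angle between the tensors
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.sum(v0_n * v1_n), -1.0, 1.0)
    # Nearly colinear tensors: fall back to plain linear interpolation
    if abs(dot) > 0.9995:
        return (1 - t) * v0 + t * v1
    theta = np.arccos(dot)
    # Weights follow the arc, so intermediate results keep a sensible norm
    return (np.sin((1 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)
```

At `t=0` this returns the first tensor, at `t=1` the second; the `t` schedule in a mergekit config varies this fraction per layer and per parameter type.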
Example configuration:
```yaml
slices:
  - sources:
      - model: OpenPipe/mistral-ft-optimized-1218
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: OpenPipe/mistral-ft-optimized-1218
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

💻 Merge your own models
In this section, we will use mergekit to load a merge configuration, run it, and upload the resulting model to the Hugging Face Hub.
First of all, we install mergekit directly from source as follows:
```
!git clone https://github.com/cg123/mergekit.git
!cd mergekit && pip install -q -e .
```

In the following block, we load the merge configuration in YAML format. We also specify the name of the merged model for future use. You can copy/paste any configuration from the previous section here.
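The loading step above can be sketched as follows. This is a minimal example, assuming PyYAML is available (mergekit itself depends on it); the variable names `model_name` and `yaml_config` and the output model name are placeholders for this illustration, and the configuration string is the SLERP example from the previous section.

```python
import yaml

# Hypothetical name for the merged model (placeholder for this example)
model_name = "mistral-7b-slerp-example"

# Merge configuration from the previous section, stored as a string
yaml_config = """
slices:
  - sources:
      - model: OpenPipe/mistral-ft-optimized-1218
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: OpenPipe/mistral-ft-optimized-1218
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
"""

# Parse the YAML once to catch indentation or syntax errors early
config = yaml.safe_load(yaml_config)

# Write it to disk so the mergekit CLI can consume it
with open("config.yaml", "w", encoding="utf-8") as f:
    f.write(yaml_config)
```

Parsing the string with `yaml.safe_load` before writing it out is a cheap sanity check: a mis-indented configuration fails here with a clear error instead of partway through the merge.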